Imagination Improves Multimodal Translation

Authors

  • Desmond Elliott
  • Ákos Kádár
Abstract

Multimodal machine translation is the task of translating sentences in a visual context. We decompose this problem into two sub-tasks: learning to translate and learning visually grounded representations. In a multitask learning framework, translations are learned in an attention-based encoder-decoder, and grounded representations are learned through image representation prediction. Our approach improves translation performance compared to the state of the art on the Multi30K dataset. Furthermore, it is equally effective if we train the image prediction task on the external MS COCO dataset, and we find improvements if we train the translation model on the external News Commentary parallel text.
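
The multitask setup described in the abstract can be illustrated with a minimal sketch of a combined objective. The abstract does not give the loss functions, so everything below is an assumption made for illustration: the cross-entropy translation loss, the cosine-based margin loss for image representation prediction, the interpolation weight `w`, the `margin` value, and all function names are hypothetical.

```python
import math

def cross_entropy(probs, target_ids):
    """Mean negative log-likelihood of the reference tokens.

    probs: one probability distribution (list of floats) per target position.
    target_ids: index of the reference token at each position.
    """
    return -sum(math.log(p[t]) for p, t in zip(probs, target_ids)) / len(target_ids)

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def image_margin_loss(predicted, true_vec, contrastive, margin=0.1):
    """Hinge loss (assumed form): the predicted image vector should be more
    similar to the true image vector than to each contrastive vector."""
    pos = cosine(predicted, true_vec)
    return sum(max(0.0, margin - pos + cosine(predicted, c)) for c in contrastive)

def multitask_loss(translation_loss, image_loss, w=0.5):
    """Interpolate the two task losses; w is an assumed hyperparameter."""
    return w * translation_loss + (1.0 - w) * image_loss

# Toy usage: two target tokens and one contrastive image vector.
t_loss = cross_entropy([[0.5, 0.5], [0.25, 0.75]], [0, 1])
v_loss = image_margin_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
total = multitask_loss(t_loss, v_loss)
```

Because the two losses are only combined at the end, the image prediction term can in principle be computed on a different dataset from the translation term, which is the kind of arrangement the abstract reports with MS COCO and News Commentary.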

Similar resources

Manipulation As an Ideological Tool in the Persian Translations of Ervand Abrahamian’s The Coup: A Multimodal CDA Approach

The present Critical Discourse Analysis (CDA) study aimed to explore the probable ideological manipulations exerted in three translations of an English political book entitled The Coup by Ervand Abrahamian. This comparative qualitative study was conducted based on Farahzad's three-dimensional CDA model. The textual, paratextual, and ...

Modulating and attending the source image during encoding improves Multimodal Translation

We propose a new and fully end-to-end approach for multimodal translation where the source text encoder modulates the entire visual input processing using conditional batch normalization, in order to compute the most informative image features for our task. Additionally, we propose a new attention mechanism derived from this original idea, where the attention model for the visual input is condi...

Zero-Resource Neural Machine Translation with Multi-Agent Communication Game

While end-to-end neural machine translation (NMT) has achieved notable success in the past years in translating a handful of resource-rich language pairs, it still suffers from the data scarcity problem for low-resource language pairs and domains. To tackle this problem, we propose an interactive multimodal framework for zero-resource neural machine translation. Instead of being passively expos...

Improving On-line Handwritten Recognition using Translation Models in Multimodal Interactive Machine Translation

In interactive machine translation (IMT), a human expert is integrated into the core of a machine translation (MT) system. The human expert interacts with the IMT system by partially correcting the errors of the system’s output. Then, the system proposes a new solution. This process is repeated until the output meets the desired quality. In this scenario, the interaction is typically performed ...

Using Images to Improve Machine-Translating E-Commerce Product Listings

In this paper we study the impact of using images to machine-translate user-generated ecommerce product listings. We study how a multi-modal Neural Machine Translation (NMT) model compares to two text-only approaches: a conventional state-of-the-art attentional NMT and a Statistical Machine Translation (SMT) model. User-generated product listings often do not constitute grammatical or well-form...

Publication date: 2017